Large Language Models Discriminate Against Speakers of German Dialects

Minh Duc Bui; Carolin Holtermann; Valentin Hofmann; Anne Lauscher; Katharina von der Wense

doi:10.18653/v1/2025.emnlp-main.415

Large Language Models Discriminate Against Speakers of German Dialects

Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, Katharina von der Wense

Abstract

Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite cultural importance, individuals speaking dialects often face negative societal stereotypes. We examine whether such stereotypes are mirrored by large language models (LLMs). We draw on the sociolinguistic literature on dialect perception to analyze traits commonly associated with dialect speakers. Based on these traits, we assess the dialect naming bias and dialect usage bias expressed by LLMs in two tasks: association task and decision task. To assess a model’s dialect usage bias, we construct a novel evaluation corpus that pairs sentences from seven regional German dialects (e.g., Alemannic and Bavarian) with their standard German counterparts. We find that: (1) in the association task, all evaluated LLMs exhibit significant dialect naming and dialect usage bias against German dialect speakers, reflected in negative adjective associations; (2) all models reproduce these dialect naming and dialect usage biases in their decision making; and (3) contrary to prior work showing minimal bias with explicit demographic mentions, we find that explicitly labeling linguistic demographics—German dialect speakers—amplifies bias more than implicit cues like dialect usage.

Anthology ID:: 2025.emnlp-main.415
Volume:: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: EMNLP
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 8212–8240
Language:
URL:: https://aclanthology.org/2025.emnlp-main.415/
DOI:: 10.18653/v1/2025.emnlp-main.415
Bibkey:
Cite (ACL):: Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, and Katharina von der Wense. 2025. Large Language Models Discriminate Against Speakers of German Dialects. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8212–8240, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: Large Language Models Discriminate Against Speakers of German Dialects (Bui et al., EMNLP 2025)
Copy Citation:
PDF:: https://aclanthology.org/2025.emnlp-main.415.pdf
Checklist:: 2025.emnlp-main.415.checklist.pdf

PDF Cite Search Checklist Fix data